fix(dr): fully qualify Velero resource names in the rebuild workflow#2031
Merged
Conversation
Applies the lesson from the CI restore drill's first run (#1995): CNPG also defines a 'backups' resource, and kubectl resolves the unqualified 'backups'/'restore' names to the wrong API group on this cluster. In a real DR the backup-discovery loop would have listed CNPG Backup CRs, found no Completed ones, and aborted the whole restore with 'No Completed Velero backup found'. Every velero get/describe/wait now uses backups.velero.io / restores.velero.io / backupstoragelocations.velero.io. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Contributor
|
🎉 This PR is included in version 1.52.2 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Applies the lesson from the restore drill's first real run on #1995 to the merged DR rebuild workflow (#1997): CNPG also defines a
backupsresource, and kubectl resolves unqualifiedbackups/restorenames topostgresql.cnpg.ioon this cluster, notvelero.io.In a real disaster, the merged workflow's backup-discovery loop would have listed CNPG Backup CRs, found no
Completedones (wrong phase semantics entirely), and aborted the whole restore withNo Completed Velero backup found— after the cluster rebuild had already succeeded. The restore-phase polls had the same problem.Every velero
get/describe/waitin the workflow now uses fully-qualified names (backups.velero.io,restores.velero.io,backupstoragelocations.velero.io), with a comment documenting the collision. Same fix already verified green in the drill on #1995 (backups.velero.io/dr-drill: Completed).Note: a stray push briefly recreated the merged PR's deleted branch (
claude/dr-rebuild-workflow) before I noticed #1997 had merged — that branch has been deleted again; this PR is the proper carrier for the fix.Validation
bash -n.🤖 Generated with Claude Code